@sollhui sollhui commented Nov 27, 2024

What problem does this PR solve?

We scaled up from three BE nodes to five, but monitoring shows that only the original three nodes receive write traffic.
image

This PR aims to ensure load balancing across BE nodes after scaling up.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

sollhui commented Nov 27, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 39952 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5a5d2daf270475316e0e988da478b9cba1957358, data reload: false

------ Round 1 ----------------------------------
q1	17610	7830	7281	7281
q2	2049	170	167	167
q3	10580	1070	1160	1070
q4	10569	732	791	732
q5	7640	2729	2638	2638
q6	245	150	146	146
q7	964	629	596	596
q8	9248	1844	1879	1844
q9	6552	6380	6428	6380
q10	7024	2334	2345	2334
q11	461	270	264	264
q12	411	219	226	219
q13	17761	3062	3071	3062
q14	253	228	216	216
q15	583	539	532	532
q16	652	575	583	575
q17	985	627	526	526
q18	7203	6685	6745	6685
q19	1341	1056	992	992
q20	478	182	186	182
q21	3937	3266	3199	3199
q22	382	312	321	312
Total cold run time: 106928 ms
Total hot run time: 39952 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7316	7357	7321	7321
q2	329	235	232	232
q3	2933	2848	2932	2848
q4	2086	1823	1882	1823
q5	5724	5875	5786	5786
q6	241	149	149	149
q7	2297	1944	1946	1944
q8	3509	3682	3615	3615
q9	9158	9238	9013	9013
q10	3631	3615	3570	3570
q11	607	518	522	518
q12	830	625	584	584
q13	17804	3264	3197	3197
q14	305	273	288	273
q15	568	533	524	524
q16	701	667	650	650
q17	1857	1653	1630	1630
q18	8453	7695	7803	7695
q19	1718	1583	1490	1490
q20	2109	1869	1859	1859
q21	5594	5305	5206	5206
q22	641	561	557	557
Total cold run time: 78411 ms
Total hot run time: 60484 ms
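
The Round 1 totals in the report above can be cross-checked from its rows. The report does not label its columns, so the reading used here is an assumption: the first number is the cold run, the next two are hot runs, and the last is the smaller of the two hot runs. That reading is inferred only from the fact that it reproduces the printed totals. The helper names `parse_rows` and `totals` are made up for this sketch.

```python
# Round 1 rows copied from the TPC-H report above (times in ms).
ROUND_1 = """
q1 17610 7830 7281 7281
q2 2049 170 167 167
q3 10580 1070 1160 1070
q4 10569 732 791 732
q5 7640 2729 2638 2638
q6 245 150 146 146
q7 964 629 596 596
q8 9248 1844 1879 1844
q9 6552 6380 6428 6380
q10 7024 2334 2345 2334
q11 461 270 264 264
q12 411 219 226 219
q13 17761 3062 3071 3062
q14 253 228 216 216
q15 583 539 532 532
q16 652 575 583 575
q17 985 627 526 526
q18 7203 6685 6745 6685
q19 1341 1056 992 992
q20 478 182 186 182
q21 3937 3266 3199 3199
q22 382 312 321 312
"""


def parse_rows(text):
    """Return a list of (query, [timings...]) from whitespace-separated rows."""
    rows = []
    for line in text.strip().splitlines():
        name, *values = line.split()
        rows.append((name, [int(v) for v in values]))
    return rows


def totals(rows):
    """Sum the first column (assumed cold run) and the last column (assumed best hot run)."""
    cold = sum(values[0] for _, values in rows)
    hot = sum(values[-1] for _, values in rows)
    return cold, hot


rows = parse_rows(ROUND_1)
cold_total, hot_total = totals(rows)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under that assumption the sums come out to 106928 ms and 39952 ms, matching the totals printed by the bot.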

@doris-robot

TPC-DS: Total hot run time: 197091 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5a5d2daf270475316e0e988da478b9cba1957358, data reload: false

query1	1287	976	971	971
query2	6247	2078	2080	2078
query3	11062	4178	4161	4161
query4	67880	29038	23654	23654
query5	4902	449	462	449
query6	415	185	197	185
query7	5645	300	285	285
query8	319	229	233	229
query9	9306	2633	2632	2632
query10	484	251	235	235
query11	17334	15321	15940	15321
query12	160	101	116	101
query13	1531	415	427	415
query14	10929	7604	7386	7386
query15	214	187	176	176
query16	7087	452	425	425
query17	1412	570	549	549
query18	1827	292	292	292
query19	213	155	146	146
query20	123	124	127	124
query21	212	109	100	100
query22	4853	4459	4550	4459
query23	34996	34244	34495	34244
query24	5400	2526	2558	2526
query25	483	395	397	395
query26	640	154	153	153
query27	1754	288	283	283
query28	4379	2461	2445	2445
query29	685	421	417	417
query30	219	151	143	143
query31	1027	839	844	839
query32	68	56	61	56
query33	428	284	289	284
query34	916	515	549	515
query35	877	780	770	770
query36	1080	980	956	956
query37	155	79	82	79
query38	4444	4433	4473	4433
query39	1552	1481	1491	1481
query40	200	99	99	99
query41	44	42	43	42
query42	101	97	96	96
query43	547	505	504	504
query44	1185	814	808	808
query45	188	172	169	169
query46	1166	705	703	703
query47	2031	1939	1928	1928
query48	433	327	320	320
query49	729	392	399	392
query50	847	392	396	392
query51	7387	7181	7164	7164
query52	98	86	85	85
query53	249	179	176	176
query54	512	385	391	385
query55	80	76	77	76
query56	260	240	234	234
query57	1305	1179	1150	1150
query58	256	222	233	222
query59	3199	3061	2908	2908
query60	264	249	247	247
query61	109	122	102	102
query62	782	690	679	679
query63	208	181	176	176
query64	1371	679	648	648
query65	3281	3192	3211	3192
query66	720	299	297	297
query67	15884	15617	15599	15599
query68	4029	575	568	568
query69	420	257	253	253
query70	1219	1150	1057	1057
query71	351	271	250	250
query72	6400	4077	3978	3978
query73	753	353	352	352
query74	10354	9109	8989	8989
query75	3391	2661	2656	2656
query76	1813	1017	1033	1017
query77	502	365	354	354
query78	10451	9420	9450	9420
query79	2545	603	615	603
query80	927	419	425	419
query81	544	226	223	223
query82	402	120	130	120
query83	177	154	145	145
query84	298	72	70	70
query85	1060	295	290	290
query86	420	310	298	298
query87	4699	4654	4544	4544
query88	4046	2155	2135	2135
query89	403	303	293	293
query90	2139	187	192	187
query91	132	101	101	101
query92	56	51	49	49
query93	3169	551	561	551
query94	949	292	300	292
query95	350	248	246	246
query96	631	278	275	275
query97	2849	2676	2779	2676
query98	214	204	199	199
query99	1593	1362	1296	1296
Total cold run time: 323267 ms
Total hot run time: 197091 ms

@doris-robot

ClickBench: Total hot run time: 32.56 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5a5d2daf270475316e0e988da478b9cba1957358, data reload: false

query1	0.03	0.02	0.03
query2	0.07	0.04	0.03
query3	0.24	0.07	0.06
query4	1.62	0.10	0.10
query5	0.43	0.42	0.41
query6	1.14	0.66	0.65
query7	0.02	0.01	0.02
query8	0.04	0.03	0.03
query9	0.57	0.53	0.49
query10	0.57	0.56	0.56
query11	0.15	0.11	0.10
query12	0.14	0.12	0.11
query13	0.62	0.60	0.61
query14	2.74	2.76	2.84
query15	0.92	0.83	0.82
query16	0.39	0.38	0.37
query17	1.06	1.05	1.05
query18	0.22	0.22	0.22
query19	2.00	1.89	2.00
query20	0.01	0.02	0.01
query21	15.37	0.60	0.57
query22	2.45	2.20	1.82
query23	17.25	0.92	0.89
query24	3.28	1.03	0.72
query25	0.31	0.19	0.11
query26	0.33	0.13	0.14
query27	0.05	0.05	0.06
query28	11.00	1.10	1.07
query29	12.53	3.28	3.21
query30	0.25	0.06	0.07
query31	2.85	0.38	0.38
query32	3.28	0.47	0.47
query33	3.01	2.98	3.08
query34	17.23	4.49	4.50
query35	4.56	4.53	4.57
query36	0.66	0.50	0.49
query37	0.10	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.03
query40	0.16	0.12	0.12
query41	0.07	0.03	0.02
query42	0.04	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 107.88 s
Total hot run time: 32.56 s
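
The ClickBench report lists only three timings per query and no precomputed minimum. A minimal sketch of how its totals appear to be derived, assuming the first column is the cold run and the hot total sums the smaller of the other two columns per query. The column meanings are an inference from the printed totals, not something the report states, and `clickbench_totals` is a hypothetical helper.

```python
from decimal import Decimal

# Rows copied from the ClickBench report above (times in seconds).
CLICKBENCH = """
query1 0.03 0.02 0.03
query2 0.07 0.04 0.03
query3 0.24 0.07 0.06
query4 1.62 0.10 0.10
query5 0.43 0.42 0.41
query6 1.14 0.66 0.65
query7 0.02 0.01 0.02
query8 0.04 0.03 0.03
query9 0.57 0.53 0.49
query10 0.57 0.56 0.56
query11 0.15 0.11 0.10
query12 0.14 0.12 0.11
query13 0.62 0.60 0.61
query14 2.74 2.76 2.84
query15 0.92 0.83 0.82
query16 0.39 0.38 0.37
query17 1.06 1.05 1.05
query18 0.22 0.22 0.22
query19 2.00 1.89 2.00
query20 0.01 0.02 0.01
query21 15.37 0.60 0.57
query22 2.45 2.20 1.82
query23 17.25 0.92 0.89
query24 3.28 1.03 0.72
query25 0.31 0.19 0.11
query26 0.33 0.13 0.14
query27 0.05 0.05 0.06
query28 11.00 1.10 1.07
query29 12.53 3.28 3.21
query30 0.25 0.06 0.07
query31 2.85 0.38 0.38
query32 3.28 0.47 0.47
query33 3.01 2.98 3.08
query34 17.23 4.49 4.50
query35 4.56 4.53 4.57
query36 0.66 0.50 0.49
query37 0.10 0.06 0.06
query38 0.05 0.04 0.03
query39 0.03 0.02 0.03
query40 0.16 0.12 0.12
query41 0.07 0.03 0.02
query42 0.04 0.02 0.03
query43 0.04 0.03 0.03
"""


def clickbench_totals(text):
    """Return (cold, hot): sum of column 1, and sum of min(column 2, column 3)."""
    cold = Decimal("0")
    hot = Decimal("0")
    for line in text.strip().splitlines():
        _, first, second, third = line.split()
        cold += Decimal(first)
        # Decimal avoids binary floating-point drift when summing hundredths.
        hot += min(Decimal(second), Decimal(third))
    return cold, hot


cold_total, hot_total = clickbench_totals(CLICKBENCH)
print(f"Total cold run time: {cold_total} s")
print(f"Total hot run time: {hot_total} s")
```

`Decimal` is used instead of `float` so the sums compare exactly against the two-decimal totals in the report.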

dataroaring
dataroaring previously approved these changes Nov 28, 2024
@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 28, 2024
@github-actions

PR approved by at least one committer and no changes requested.

@github-actions

PR approved by anyone and no changes requested.

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Nov 28, 2024
sollhui commented Nov 28, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 40193 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c6bac64fab8efbc91fe28bd7b79e75b52f7db978, data reload: false

------ Round 1 ----------------------------------
q1	17825	7516	7280	7280
q2	2048	181	165	165
q3	10623	1163	1234	1163
q4	10371	770	770	770
q5	7612	2846	2791	2791
q6	239	151	155	151
q7	992	677	626	626
q8	9235	1877	2003	1877
q9	6577	6493	6388	6388
q10	6978	2306	2347	2306
q11	464	268	264	264
q12	413	218	223	218
q13	17788	3036	3067	3036
q14	241	216	211	211
q15	566	533	533	533
q16	667	574	571	571
q17	1027	494	550	494
q18	7603	6689	6673	6673
q19	1354	1088	967	967
q20	470	188	182	182
q21	4006	3213	3332	3213
q22	387	314	322	314
Total cold run time: 107486 ms
Total hot run time: 40193 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7289	7227	7277	7227
q2	336	233	231	231
q3	2900	2911	3032	2911
q4	2149	1862	1775	1775
q5	5683	5764	5652	5652
q6	219	140	144	140
q7	2192	1773	1853	1773
q8	3397	3547	3525	3525
q9	8954	8950	8945	8945
q10	3616	3596	3579	3579
q11	598	494	507	494
q12	819	614	584	584
q13	13666	3196	3166	3166
q14	301	274	263	263
q15	563	521	520	520
q16	671	636	637	636
q17	1808	1615	1572	1572
q18	7985	7496	7452	7452
q19	1679	1627	1614	1614
q20	2094	1821	1799	1799
q21	5252	5293	5288	5288
q22	612	541	543	541
Total cold run time: 72783 ms
Total hot run time: 59687 ms

@doris-robot

TPC-DS: Total hot run time: 191732 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c6bac64fab8efbc91fe28bd7b79e75b52f7db978, data reload: false

query1	970	388	365	365
query2	6529	2077	2046	2046
query3	6699	212	203	203
query4	34274	23638	23571	23571
query5	4348	457	464	457
query6	276	203	181	181
query7	4612	299	300	299
query8	303	228	225	225
query9	9526	2728	2725	2725
query10	487	258	255	255
query11	18239	15442	15285	15285
query12	148	102	106	102
query13	1659	408	424	408
query14	10628	7676	7301	7301
query15	253	175	179	175
query16	8164	424	464	424
query17	1685	583	569	569
query18	2076	298	328	298
query19	349	151	147	147
query20	119	111	115	111
query21	212	104	103	103
query22	4617	4417	4215	4215
query23	35692	34266	34199	34199
query24	11417	2460	2513	2460
query25	658	385	388	385
query26	1778	149	152	149
query27	2793	288	279	279
query28	8304	2442	2423	2423
query29	1011	407	421	407
query30	303	148	151	148
query31	1053	798	819	798
query32	97	54	83	54
query33	768	296	287	287
query34	965	512	525	512
query35	917	749	730	730
query36	1100	944	950	944
query37	283	78	74	74
query38	4499	4280	4289	4280
query39	1493	1430	1406	1406
query40	287	99	98	98
query41	47	43	43	43
query42	113	101	99	99
query43	533	476	492	476
query44	1223	813	821	813
query45	188	160	166	160
query46	1157	722	675	675
query47	1927	1840	1806	1806
query48	423	313	319	313
query49	1288	381	402	381
query50	793	400	390	390
query51	7264	7071	7141	7071
query52	98	89	91	89
query53	261	179	184	179
query54	1179	396	410	396
query55	81	78	79	78
query56	262	246	251	246
query57	1314	1157	1130	1130
query58	230	216	223	216
query59	3226	3010	3045	3010
query60	267	253	248	248
query61	108	122	111	111
query62	862	659	678	659
query63	216	191	186	186
query64	5133	657	632	632
query65	3319	3276	3249	3249
query66	1426	314	311	311
query67	16117	15819	15461	15461
query68	4873	535	542	535
query69	420	263	247	247
query70	1167	1158	1166	1158
query71	338	253	241	241
query72	6330	4116	4030	4030
query73	769	363	371	363
query74	10594	9124	9225	9124
query75	3484	2681	2761	2681
query76	2778	1077	1067	1067
query77	528	275	271	271
query78	10541	9509	9379	9379
query79	2406	610	615	610
query80	1192	429	434	429
query81	550	228	243	228
query82	921	114	112	112
query83	259	153	147	147
query84	242	69	71	69
query85	1414	305	298	298
query86	428	309	306	306
query87	4787	4730	4557	4557
query88	3747	2248	2189	2189
query89	414	308	300	300
query90	2177	190	188	188
query91	139	106	102	102
query92	67	51	49	49
query93	1579	534	532	532
query94	1132	324	293	293
query95	359	251	258	251
query96	609	281	279	279
query97	2822	2668	2736	2668
query98	224	195	196	195
query99	1535	1308	1336	1308
Total cold run time: 307471 ms
Total hot run time: 191732 ms

@doris-robot

ClickBench: Total hot run time: 31.76 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c6bac64fab8efbc91fe28bd7b79e75b52f7db978, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.04
query3	0.23	0.09	0.07
query4	1.61	0.10	0.10
query5	0.44	0.42	0.43
query6	1.16	0.67	0.66
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.57	0.55	0.51
query10	0.56	0.56	0.57
query11	0.14	0.10	0.10
query12	0.14	0.12	0.12
query13	0.61	0.61	0.61
query14	2.83	2.76	2.71
query15	0.93	0.84	0.82
query16	0.39	0.38	0.36
query17	1.01	0.98	1.03
query18	0.22	0.22	0.21
query19	1.89	1.82	1.97
query20	0.01	0.01	0.01
query21	15.37	0.59	0.57
query22	2.75	2.50	1.54
query23	17.01	1.20	0.90
query24	2.87	1.23	0.38
query25	0.24	0.07	0.20
query26	0.37	0.14	0.14
query27	0.04	0.06	0.05
query28	11.14	1.10	1.08
query29	12.55	3.23	3.23
query30	0.25	0.06	0.06
query31	2.87	0.38	0.38
query32	3.29	0.47	0.47
query33	3.03	3.06	3.03
query34	17.00	4.51	4.47
query35	4.50	4.53	4.44
query36	0.64	0.50	0.48
query37	0.09	0.06	0.06
query38	0.05	0.03	0.04
query39	0.04	0.02	0.02
query40	0.16	0.12	0.12
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 107.3 s
Total hot run time: 31.76 s
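
The thread contains two benchmark runs, one on commit 5a5d2daf270475316e0e988da478b9cba1957358 and one on commit c6bac64fab8efbc91fe28bd7b79e75b52f7db978. A small sketch for comparing their hot-run totals as a relative change. The dictionary labels and the `percent_change` helper are illustrative names, and each commit was measured only once here, so small differences may simply be run-to-run noise.

```python
# Hot-run totals copied from the two sets of reports in this thread.
# Each value is (run on commit 5a5d2da..., run on commit c6bac64...).
HOT_TOTALS = {
    "TPC-H (ms)": (39952, 40193),
    "TPC-H, runtime_filter_mode=off (ms)": (60484, 59687),
    "TPC-DS (ms)": (197091, 191732),
    "ClickBench (s)": (32.56, 31.76),
}


def percent_change(before, after):
    """Relative change from before to after, in percent (negative means faster)."""
    return (after - before) / before * 100.0


for name, (before, after) in HOT_TOTALS.items():
    print(f"{name}: {before} -> {after} ({percent_change(before, after):+.2f}%)")
```

All four changes are within roughly three percent in either direction.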

@liaoxin01 liaoxin01 left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 29, 2024
@github-actions

PR approved by at least one committer and no changes requested.

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit fed46d2 into apache:master Nov 29, 2024
15 of 16 checks passed
github-actions bot pushed a commit that referenced this pull request Nov 29, 2024
…#44693)

We scaled up from three BE nodes to five, but monitoring shows that
only the original three nodes receive write traffic.

![image](https://github.com/user-attachments/assets/947fe8a9-745d-4c37-93d4-15b0e27f12fc)

This PR aims to ensure load balancing across BE nodes after scaling up.
yiguolei pushed a commit that referenced this pull request Dec 2, 2024
… up BE nodes #44693 (#44799)

Cherry-picked from #44693

Co-authored-by: hui lai <laihui@selectdb.com>
dataroaring pushed a commit that referenced this pull request Dec 3, 2024
… up BE nodes #44693 (#44798)

Cherry-picked from #44693

Co-authored-by: hui lai <laihui@selectdb.com>

Labels

approved Indicates a PR has been approved by one committer. dev/2.1.8-merged dev/3.0.4-merged reviewed

5 participants